Abstract

Background: Campylobacter jejuni is a potent bacterial pathogen responsible for the diarrheal disease
campylobacteriosis. It is recognized as a major health issue owing to the unavailability of appropriate vaccines and
clinical treatment options. Like other pathogens, C. jejuni exploits host cellular components of an infected individual to
disseminate the disease. The host-pathogen interfaces during C. jejuni infection are complex and dynamic, and
involve the nicking of the host cell environment, enzymes and pathways. Existing therapies rely on
only a small number of drugs, most of which are inadequate because of severe host toxicity or drug
resistance. To find therapeutic alternatives, the identification of new biotargets is highly desirable.
Understanding the molecules involved in pathogenesis has the potential to yield new and exciting strategies for
therapeutic intervention. In this direction, advances in bioinformatics have opened up new possibilities for the
rapid measurement of global changes during infection, and these could be exploited to understand the molecular
interactions involved in campylobacteriosis.

Methods: In this study, homology modeling, epitope prediction and identification of ligand binding sites were
explored. A further attempt to generate a robust 3D model of the cytolethal distending toxin protein from C. jejuni is
described for the first time.

Results: The CDT protein of C. jejuni was analyzed using various bioinformatics and immunoinformatics tools,
including sequence and structure tools. A total of fifty-five antigenic determinants were predicted, and the prediction
of CTL epitopes revealed five MHC ligands in CDT. Three potential pocket binding sites were
found in the model that can be useful for drug design.

Conclusions: This model, we hope, will be of help in designing and predicting novel CDT inhibitors and vaccine 
candidates. 
Background

Campylobacter jejuni is a prominent bacterial cause of 
enteric campylobacteriosis in the entire world [1]. Cam- 
pylobacter is extensively distributed in poultry; neverthe- 
less, cattle, pigs, sheep, and pet animals may also be a 
source of these microorganisms. Infection may result
from eating undercooked meat or from cross-contamination
of ready-to-eat food at the time of
preparation or storage. C. jejuni-linked enterocolitis is
characteristically coupled with a local acute inflammatory
response that involves intestinal tissue damage [2].
The genome of C. jejuni has been sequenced, yet only a
few prospective virulence factors produced by C. jejuni
are well characterized [3].

Cytolethal distending toxins (CDTs) are a class of
heterotrimeric toxins produced by C. jejuni and also by
closely related species such as C. fetus and C. coli [4,5],
as well as by Shigella [6] and Escherichia coli [7]. The
toxin undergoes retrograde transport through the Golgi
complex and the endoplasmic reticulum and is then
translocated into the nuclear compartment, where it
exerts its toxic activity [8]. CDT comprises three protein
subunits, namely CdtA, CdtB and CdtC. It causes
progressive cellular distention with ultimate cell death
and has been proposed as a virulence factor in the
pathogenesis of C. jejuni [9]. These results suggest that
CDTs are involved in invasion, survival and
internalization into the host cell [10-13]. Although CDT
from C. jejuni has been studied and characterized in the
laboratory [14,15], research on immune responses and
the pathogenesis of C. jejuni remains largely unexplored.

Progress in computational methods capable of
predicting immune epitopes for B lymphocytes and T
lymphocytes will facilitate the screening of pathogens
for immunogenic antigens. Epitope-based vaccines
elicit an immune response by presenting immunogenic
peptides bound to major histocompatibility complex
molecules to the T-cell receptor (TCR) [16]. Given the
unavailability of a 3D structure of CDT, it is challenging
to select a proper target from which to predict epitopes
and ligand binding sites in the protein. Hence, this study
aims to investigate the CDT of C. jejuni, with a special
focus on its structural and functional aspects, through a
bioinformatics approach. This study has important
implications for the selection of CTL epitopes, a critical
step in the development of vaccines.

Methods 

Sequence acquisition and analysis 

We retrieved the sequence of the CDT of C. jejuni from
the NCBI database using the query "CDT C.
jejuni". The sequence was saved in FASTA format and
used for further analysis. Primary structure analysis
was done using ExPASy ProtParam (www.expasy.org).
The secondary structure of the protein was computed
using different servers, namely Jpred3, GOR-IV and
SOPMA [17], to check for the presence of alpha helices
and beta-pleated sheets in the structure. To determine
the possible function of the CDT of C. jejuni, the
sequence was subjected to comparative protein
structure modeling on the different servers.
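
Two of the ProtParam descriptors reported later in this paper, GRAVY and the aliphatic index, can be recomputed from the sequence alone. The following is a minimal sketch, assuming the standard Kyte-Doolittle hydropathy scale and the usual aliphatic index formula (Ala + 2.9 x Val + 3.9 x (Ile + Leu), in mole percent); the function names are illustrative and are not part of the original analysis, and the peptide used in the usage line is simply one of the predicted CTL epitopes.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Usage on a short peptide (illustrative input, not the full 268-aa sequence).
peptide = "KTNTCLNAY"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
```

A negative GRAVY value points to an overall hydrophilic sequence, and a higher aliphatic index is generally read as a sign of greater thermostability.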

3D-Model building and validation 

The cytolethal distending toxin sequence of C. jejuni (CDTCJ)
[EDZ32284.1] was used to build a 3D structure by
homology modeling because no crystal or NMR structure
of the CDT protein was available in the Protein
Data Bank (PDB). The 3D structure of the CDT protein
was modeled using the restraint-based approach
implemented in Modeller. The 3D model was generated
using the ModWeb server, which generates 3D models
along with their confidence score (C-Score). Template
selection for the homology modeling of the CDT protein
was performed by submitting the amino acid sequence
of the target protein to the ModWeb server [18]. The
crystal structure of CDT from Haemophilus ducreyi
(PDB ID: 1SR4) was used as the template. After
generating the 3D model, structure analysis and
stereochemical analysis were performed using different
evaluation and validation tools. The final model was
validated using the SAVES online tool
(http://nihserver.mbi.ucla.edu/SAVES/). The Ramachandran
plot was obtained using PROCHECK [19] and RAMPAGE
[20], which helped in evaluating the backbone
conformation. The Ramachandran plot was also used to
check for non-Gly residues in the disallowed regions.
Verify3D and the ProSA-web tool [21] were used to
determine Z-scores. ERRAT was used to estimate the
overall quality of the model, and the quality of the model
was further checked using the Z-scores.
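
A Ramachandran plot is built from the backbone dihedral angles phi, defined by C(i-1), N(i), CA(i), C(i), and psi, defined by N(i), CA(i), C(i), N(i+1). The sketch below shows how one such angle can be computed from four atomic coordinates. It is an illustration only, not the PROCHECK or RAMPAGE implementation; the helper names and the toy coordinates are assumptions made for the example.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Remove the components of b0 and b2 that lie along the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (hypothetical, not taken from the CDTCJ model):
# the first and last atoms lie on the same side of the central bond.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))
```

For a real model, phi would be obtained as dihedral(C_prev, N, CA, C) and psi as dihedral(N, CA, C, N_next) for each residue, and each (phi, psi) pair would then be assigned to a favored, allowed or disallowed region.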

Epitope prediction of protein antigens 

The SEPPA (Spatial Epitope Prediction of Protein Antigens)
server at the Life Science and Technology School,
Tongji University, Shanghai, China
(http://lifecenter.sgst.cn/seppa/) was used to predict
conformational B-cell epitopes.

The 3D protein structure predicted by Modeller was
used as input. Each residue in the query protein is
given a score according to information on its
neighboring residues; a higher score corresponds to a
higher probability that the residue is involved in an
epitope [22]. The threshold was left at its default value
of 1.80, which helps to specify the epitope residues [23].
The transmembrane topology of the CDTCJ protein was
checked using the TMHMM online tool [24], and the
antigenicity of the protein was checked using the
SVMTriP online antigen prediction server [25]. Several
algorithms are available that can predict the location
and binding specificity of CTL epitopes in protein
sequences. In this study, cytotoxic T-lymphocyte epitope
prediction was done using the NetCTL 1.2 server [26].

Subcellular localization prediction

The subcellular localization of CDT was predicted using
CELLO, an approach based on a multi-class SVM
classification system [27]. CELLO uses four types of
sequence coding schemes: the amino acid composition,
the dipeptide composition, the partitioned amino acid
composition and the sequence composition based on the
physicochemical properties of amino acids. The TargetP
1.1 server was also used to predict the cleavage site
of CDT [28].

Protein interaction network mapping 

Protein-protein interactions were retrieved from the
STRING database [29], which comprises known and
predicted physical and functional protein-protein
interactions. STRING was used in protein mode, and only
interactions with high confidence levels (>0.7) were
kept. STRING quantitatively integrates interaction data
from multiple sources for many organisms, and transfers
information among these organisms where applicable.
Network visualization was done with the Cytoscape
software [30].
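
The confidence cutoff described above amounts to a simple filter on the STRING scores. As a minimal sketch, the snippet below applies the >0.7 cutoff to the partner/score pairs listed in Table 1 of this paper; the variable and function names are illustrative, and the code does not query STRING itself.

```python
# (partner ID, STRING score) pairs as listed in Table 1 of this paper.
TABLE_1_SCORES = [
    ("cdtB", 0.920), ("cdtC", 0.897), ("cydB", 0.651), ("cydA", 0.651),
    ("Cj0080", 0.651), ("Cje0079", 0.628), ("LctP", 0.614),
    ("cadF", 0.569), ("pldA", 0.517), ("rpoZ", 0.514),
]


def filter_by_confidence(interactions, cutoff=0.7):
    """Keep only interactions whose score is strictly above the cutoff."""
    return [(partner, score) for partner, score in interactions
            if score > cutoff]


high_confidence = filter_by_confidence(TABLE_1_SCORES)
print([partner for partner, _ in high_confidence])
```

With the cutoff of 0.7, only cdtB and cdtC are retained; the remaining Table 1 entries have scores between 0.4 and 0.7, which STRING labels as medium confidence.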

Ligand binding sites prediction 

We used the MetaPocket 2.0 server
(http://metapocket.eml.org) to identify ligand-binding
sites on the protein surface. MetaPocket is a consensus
method [31] developed jointly at the Technical University
of Dresden and Zhejiang University, in which the
predicted binding sites from eight methods, i.e., PASS11
(PAS), LigsiteCS (LCS), Q-SiteFinder (QSF), GHECOM
(GHE), POCASA (PCS), Fpocket (FPK), SURFNET (SFN)
and ConCavity (CON), are combined to improve the
prediction success rate.
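
The idea of a consensus pocket predictor can be sketched in a few lines: pockets reported by different methods are grouped when they lie close together in space, and each group is ranked by the sum of its members' z-scores. The code below is a simplified illustration under stated assumptions, not MetaPocket's actual algorithm; the greedy grouping, the 8.0 distance cutoff, the coordinates and the z-scores are all hypothetical.

```python
import math


def cluster_pockets(pockets, cutoff=8.0):
    """Greedily group pockets whose centers lie within `cutoff` of a
    cluster's first (highest-scoring) member, then rank the clusters
    by their total z-score."""
    clusters = []
    for pocket in sorted(pockets, key=lambda p: p["z"], reverse=True):
        for cluster in clusters:
            if math.dist(pocket["center"], cluster[0]["center"]) <= cutoff:
                cluster.append(pocket)
                break
        else:
            clusters.append([pocket])
    return sorted(clusters, key=lambda c: sum(p["z"] for p in c),
                  reverse=True)


# Hypothetical pocket centers and z-scores, for illustration only.
pockets = [
    {"label": "GHE-1", "center": (0.0, 0.0, 0.0), "z": 2.5},
    {"label": "LCS-1", "center": (1.0, 1.0, 0.0), "z": 2.0},
    {"label": "FPK-1", "center": (2.0, 0.0, 1.0), "z": 1.8},
    {"label": "QSF-2", "center": (20.0, 20.0, 20.0), "z": 1.5},
    {"label": "LCS-3", "center": (21.0, 20.0, 19.0), "z": 0.9},
]

for rank, cluster in enumerate(cluster_pockets(pockets), start=1):
    members = [p["label"] for p in cluster]
    print(rank, members, round(sum(p["z"] for p in cluster), 2))
```

The point of the consensus step is that a site found independently by several methods accumulates a higher total score than a site reported by only one.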

Structure comparison 

The structure comparison was executed by using DaliLite 
server [32]. 

Results and discussion 

The current study was initiated to perform a
structure-based sequence analysis of the CDT protein
from C. jejuni. The protein sequence was obtained from
the NCBI protein database using accession number
gi|205345645|gb|EDZ32284.1| cytolethal distending toxin
[Campylobacter jejuni]. Primary structure analysis
revealed that the CDT protein (268 aa) had a molecular
weight of 29.94 kDa and a theoretical isoelectric point
(pI) of 6.81. A pI slightly below 7 indicates that the
protein carries a small net negative charge at neutral
pH. The instability index (II) was 18.60, which
classifies the protein as stable. The aliphatic index
was 84.10, and the N-terminus of the sequence carried M
(Met). The negative grand average of hydropathicity
(GRAVY) of -0.061 indicated that the protein was
hydrophilic. The amino acids Asn (N), Phe (F), Ala (A)
and Leu (L) were found in high proportion in the
protein. The secondary structure prediction disclosed
the presence of 8.21% α-helices, 4.85% β-turns, 25.37%
extended strands and 61.57% coils (Figure 1). The
transmembrane topology of the CDTCJ protein was
checked using the TMHMM online tool. The TMHMM
server showed that residues 23-268 lay in the outside
region, residues 5-22 formed the transmembrane helix
and residues 1-4 lay in the inside region of the protein.
Hydropathy analysis of the CDTCJ protein of C. jejuni by
the TOPCONS [33], SignalP 4.0 [34] and TMHMM
programs suggested the presence of only one TM helix.
We therefore localized the N terminus of CDTCJ in the
cytoplasm. A consensus predicted topology is presented
in Figure 2.
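
The TMHMM result reported above can be expanded into a per-residue topology string, which makes it easy to check that the three segments cover the whole 268-residue sequence. This is a minimal sketch; the one-letter codes (i for inside, M for membrane helix, o for outside) and the function name are choices made for this example.

```python
# Segment boundaries as reported in the text (1-based, inclusive).
SEGMENTS = [(1, 4, "inside"), (5, 22, "TM helix"), (23, 268, "outside")]
CODES = {"inside": "i", "TM helix": "M", "outside": "o"}


def topology_string(segments, length):
    """Expand (start, end, region) segments into one label per residue.
    Positions not covered by any segment are marked with '?'."""
    labels = ["?"] * length
    for start, end, region in segments:
        for position in range(start, end + 1):
            labels[position - 1] = CODES[region]
    return "".join(labels)


topology = topology_string(SEGMENTS, 268)
print(topology[:30])
print({code: topology.count(code) for code in "iMo?"})
```

The single helix spans 18 residues (positions 5 to 22), and no position is left unassigned.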

The subcellular localization of CDT was predicted
using CELLO, an approach based on a two-level support
vector machine (SVM) system. This server predicts the
subcellular localization of proteins from Gram-negative
bacteria using support vector machines based on
n-peptide compositions. The CELLO output gave
reliability scores of 0.198 for outer membrane, 1.76 for
periplasmic, 0.803 for extracellular and 2.493 for
cytoplasmic localization; the highest score indicates
that the protein is cytoplasmic.

Model function and validation 

To determine the possible function of CDT, the
sequence was subjected to comparative protein structure
modeling using the target protein sequence as the query
for the different servers described in Methods. The
modeling of CDTCJ was performed using the
restraint-based approach implemented in ModWeb [35],
and significant hits were obtained. A set of three models
of the CDT protein was constructed. The 3D structure of
the CDTCJ protein was developed from the X-ray
structure of Haemophilus ducreyi CDT (PDB ID: 1SR4,
chain A, at 2.0 Å resolution) as the template for
homology modeling. The aligned region of the target
(residues 113-258) showed 37% sequence identity with
residues 75-219 of template 1SR4. The resulting 3D
models of CDTCJ were sorted according to the scores
calculated with the discrete optimized protein energy
(DOPE) scoring function. The final model, which had the
lowest root mean square deviation (RMSD) relative to
the trace (Cα atoms) of the crystal structure, was
selected for further studies. The model was validated by
assessing the quality of the backbone conformation with
PROCHECK. The resulting Ramachandran plot (phi-psi
pairs) had 86.5% of residues in the most favored regions,
11.1% of residues in additional allowed regions, 1.6% of
residues in generously allowed regions and 0.8% of
residues in disallowed regions (Figure 3). These values
indicated a good-quality model. By comparison, the
crystal structure of H. ducreyi CDT (PDB ID: 1SR4) has
89% of residues in the most favored regions [36]. To
characterize the model, structural motifs and
mechanistically important loops were assigned to build
the final 3D model of CDTCJ. The 3D model of CDTCJ
built on template 1SR4 consists of two domains that
encompass 8 β-sheets and 3 α-helices (Figure 4).
Verify3D and ERRAT were also used to further assess
the quality of the CDTCJ model. Verify3D analyzes the
compatibility of the model with its own amino acid
sequence, and the results revealed that 59.86% of the
residues had an averaged 3D-1D score above 0.2.
Verify3D and ProSA gave good scores for overall model
quality. However, the ERRAT validation of the CDTCJ
model indicated regions where the calculated errors
were higher than expected, which decreased the overall
quality score to 46.7%.

Figure 2 Transmembrane topology of CDTCJ of C. jejuni.

Figure 4 Homology model of CDTCJ. 3D structure of the CDTCJ
protein visualized with the UCSF Chimera visualization tool. The cartoon
representation of the modeled 3D structure of CDTCJ, built using PDB ID:
1SR4, shows helices (orange), sheets (purple) and loops (sky blue).
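
The RMSD criterion used for model selection in this section is the square root of the mean squared distance between paired atoms. The sketch below computes it for two equally long lists of Cα coordinates, assuming the two structures have already been superimposed; it performs no fitting of its own, and the coordinates in the usage example are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired 3D coordinates.
    Assumes both structures are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((xa - xb) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for xa, xb in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical C-alpha traces of two models of the same three residues.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.5, 0.0), (3.8, -0.5, 0.0), (7.6, 0.5, 0.0)]
print(rmsd(model, reference))
```

Among candidate models built on the same template, the one with the smallest value deviates least from the reference trace.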

Structure comparison analysis 

Comparative analysis of the CDTCJ structure was
performed using the DaliLite v. 3.3 server. This server is
a network service for comparing protein structures in 3D;
it computes optimal and suboptimal structural
alignments between two protein structures. Such
comparisons help in understanding the fundamental
roles of proteins and their functions, because structural
similarity relationships among protein structures allow
users to infer the functions of newly discovered proteins
[37]. The final refined model of CDTCJ was superimposed
on the template using DaliLite. The superimposition of
the model on the template is shown in Figure 5. The
DaliLite server returned 851 alignments with compatible
Z-scores. The highest Z-scores were obtained for PDB
IDs 2F2F and 1SR4 (28.3 and 27.5, with 38% and 37%
sequence identity, respectively). It is interesting to note
that the two top-scoring proteins, 2F2F and 1SR4, were
also used for the development of the 3D model.

Epitope prediction of protein antigens 

Potentially immunogenic regions of CDTCJ were
predicted using the SEPPA server. This server analyzes
3D structures and aims to divide the antigen surface
into epitopic and non-epitopic patches on the basis of
different propensity scores and solvent accessibility;
such methods rely on training datasets comprising
resolved antibody/antigen complexes [38]. A total of 55
epitopes were predicted from 146 aa using the default
threshold value of 1.80. The predicted epitopes,
visualized with Jmol in different renderings, are shown
in Figure 6. In this structure, tints from blue to red
represent rising antigenicity. The predicted epitope
residues are shown as red solid spheres. The prediction
results are also displayed in a table in which each
residue is listed sequentially; the predicted epitope
residues are highlighted in yellow and the core residues
are shown in lowercase. Antigenic epitopes that are
preferentially recognized by antibodies can help in the
design of vaccine components and immunodiagnostic
reagents [39].

Cytotoxic T-Lymphocytes (CTL) epitopes 

Epitope predictors are routinely tested on large sets of
epitopes derived from various pathogens. Schellens et al.
[40] identified eighteen new CTL epitopes out of a set of
twenty-two predicted CTL epitopes in HIV-1 using
NetCTL. We screened all possible 9-aa peptide
fragments within the protein and eliminated those
fragments that cannot be correctly processed by the
proteasome, TAP or the MHC class I molecules. The
prediction of CTL epitopes revealed five high-scoring
MHC ligands in the CDT sequence, positioned at
10CCFMTFFLY18, 39DTDPLKLGL47, 132AQGNWIWGY140,
170KTNTCLNAY178 and 217IQAPITNLY225. These are the
immunodominant epitopes restricted by MHC class I,
located at scattered positions in the protein sequence.
These data indicate that CTL epitopes in CDT are
randomly distributed, and this distribution is similar to
that of CTL epitopes in proteins from other proteomes.
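
The first step of the screening described above, enumerating every 9-residue fragment of the protein, can be sketched with a sliding window. The snippet below only generates the candidate peptides with their 1-based start and end positions; it does not reproduce NetCTL's scoring of proteasomal cleavage, TAP transport or MHC class I binding, and the function name and toy sequence are illustrative.

```python
def nine_mers(sequence, k=9):
    """Return (start, peptide, end) for every k-residue window,
    using 1-based inclusive positions."""
    return [(i + 1, sequence[i:i + k], i + k)
            for i in range(len(sequence) - k + 1)]


# Toy 12-residue sequence (hypothetical, not the CDT sequence).
for start, peptide, end in nine_mers("ACDEFGHIKLMN"):
    print(f"{start}{peptide}{end}")
```

A protein of length L yields L - 8 overlapping 9-mers, so the 268-aa CDT sequence gives 260 candidate peptides, of which five were retained as predicted MHC ligands.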
Table 1 List of predicted interactive proteins with CDTCJ of C. jejuni

Sr. No  ID       Protein name                                 Amino acid residues  Score
1       cdtB     Cytolethal distending toxin, subunit B       265                  0.920
2       cdtC     Cytolethal distending toxin, subunit C       189                  0.897
3       cydB     Cytochrome d ubiquinol oxidase, subunit II   374                  0.651
4       cydA     Cytochrome d ubiquinol oxidase, subunit I    520                  0.651
5       Cj0080   Hypothetical protein                         89                   0.651
6       Cje0079  Hypothetical protein                         34                   0.628
7       LctP     L-lactate permease                           565                  0.614
8       cadF     Fibronectin binding protein                  319                  0.569
9       pldA     Phospholipase A                              329                  0.517
10      rpoZ     DNA-directed RNA polymerase, subunit omega   74                   0.514

Protein interaction network mapping 

To compute the protein-interaction properties of CDT,
we used the Search Tool for the Retrieval of Interacting
Genes/Proteins (STRING) database of physical and
functional interactions [41]. Predicting CDTCJ
interactions from protein structural similarities makes it
possible to construct candidate interactions of possibly
significant functional relevance. For this purpose, the
relations among the ten identified proteins were
examined. The interaction network of genetically
interacting proteins possibly related in function in
C. jejuni is shown in Figure 7, and detailed information
is presented in Table 1. Green lines indicate
co-localization in genomes (likely operon structures),
and blue lines indicate statistically significant
co-occurrence across multiple genomes. A graph of the
CDTCJ network shows the identified CDTCJ-interacting
proteins and phylogenomic profiling of CDT-related
functions.




Figure 8 The predicted potential binding sites in CDT protein of C. jejuni (panels: Pocket 1, Pocket 2 and Pocket 3). Pocket color descriptions are indicated as: red - MPK, actinium -
PAS, magenta - QSF, potassium - FPK, wheat - SFN, yellow - GHE, blue - CON and raspberry - PCS. The exact residue location information is given
in Table 2.
Table 2 Predicted ligand binding sites in residues

Site no                 Residues
Header binding site 1   ILE_9^118^   LEU_9^126^   TRP_9^154^   ILE_9^166^   LEU_9^175^
                        ILE_9^208^   LEU_9^116^   TRP_9^196^   LEU_9^198^   LEU_9^158^
                        ALA_9^164^   MET_9^165^   [illegible]  [illegible]  [illegible]
                        [illegible]  [illegible]  THR_9^252^   ASN_9^210^   LYS_9^215^
                        ILE_9^234^   CYS_9^216^   ASN_9^213^   LYS_9^209^   ILE_9^182^
                        PHE_9^163^   ASP_9^162^   ASN_9^161^   [?]R_9^159^  PRO_9^160^
Header binding site 2   LEU_9^116^   THR_9^117^   THR_9^252^   THR_9^253^   PRO_9^254^
                        PRO_9^255^   ALA_9^125^   LEU_9^142^   ARG_9^152^   LEU_9^119^
                        GLY_9^123^   PHE_9^256^   LYS_9^146^   THR_9^257^
Header binding site 3   TRP_9^136^   TRP_9^138^   VAL_9^231^   PHE_9^232^   ASN_9^180^
                        GLY_9^181^   LYS_9^233^   GLY_9^179^   ILE_9^182^   ILE_9^137^

Ligand binding sites

The potential binding sites (PBS) of proteins are those
residues or atoms that bind to ligands directly on the
protein surface; they lie close to the ligand binding
sites. After clustering the top three sites from the
different methods (PAS, QSF, FPK, SFN, GHE, CON and
LCS), MetaPocket 2.0 predicted seven clusters for the
protein structure; we present here the three
best-scoring pocket sites (Figure 8).

The first MetaPocket site (MPK1) consists of six
pocket sites: the first pocket from GHECOM (GHE-1),
the first pocket from LigsiteCS (LCS-1), the first pocket
from Fpocket (FPK-1), the second pocket from PASS
(PAS-2), the first pocket from Q-SiteFinder (QSF-1) and
the first pocket from ConCavity (CON-1), with a total
Z-score of 11.06 and a size of 6. The second MetaPocket
site (MPK2) consists of four pockets, SFN-1, FPK-2,
QSF-3 and PAS-3, with a total Z-score of 7.61. The
third MetaPocket site (MPK3) consists of three pockets:
the second pocket from Q-SiteFinder (QSF-2), the
third pocket from LigsiteCS (LCS-3) and the third pocket
from GHECOM (GHE-3), with a total Z-score of 2.90 and
a size of 3. Table 2 lists the potential binding sites of
the predicted CDT protein of C. jejuni as residues.
Header binding sites 1, 2 and 3 correspond to
MetaPockets 1, 2 and 3, respectively. The potential
binding sites of the three MetaPockets are given in
residue format, with each line starting with 'RESI'.
Each residue entry is constructed of three parts:
residue name, chain indicator and residue sequence
number.

Conclusions

The purpose of the present study was to perform a
global screening for new immunogenic HLA class I
(HLA-I) restricted cytotoxic T cell (CTL) epitopes of
potential utility as vaccine candidates against
campylobacteriosis. Five epitopes of CDTCJ were
identified. It is anticipated that the peptide
170KTNTCLNAY178 can serve as a novel potential
vaccine candidate against diarrhea. These results have
important implications for the rational design of CTL
epitope-based CDT campylobacteriosis diagnostics and
vaccines applicable to all ethnic groups. The present
research offers a framework for understanding the
structure and function of the CDT protein. Additional
experimental work is required to validate this epitope.
The identification of ligand-binding sites is often the
starting point for protein function annotation and
structure-based drug design. In this study, we identified
three predicted potential binding sites in the CDT
protein of C. jejuni. These are putative active sites on
the protein surface that may carry out protein functions.